1 Optimising hot paths in a dynamic binary translator
David Ung, Cristina Cifuentes

March 2001 ACM SIGARCH Computer Architecture News, Volume 2 

Publisher: ACM Press 

In dynamic binary translation, code is translated "on the fly" at run-time, 
perceives ordinary execution of the program on the target machine. Code
frequently executed follow the same sequence of flow control over a per 
fragments form a hot path and are optimised to improve the overall perfc 
program. Multiple hot paths may also exist in programs. A program may
one hot path for some time, but later switch to anot ... 



Keywords: binary translation, dynamic compilation, dynamic execution. 



2 Efficient instrumentation for code coverage testing 
Mustafa M. Tikir, Jeffrey K. Hollingsworth

July 2002 ACM SIGSOFT Software Engineering Notes, Proceedings o
SIGSOFT international symposium on Software testing and
'02, Volume 27 Issue 4
Publisher: ACM Press 

Evaluation of Code Coverage is the problem of identifying the parts of a 
not execute in one or more runs of a program. The traditional approach f 
tools is to use static code instrumentation. In this paper we present a new 
dynamically insert and remove instrumentation code to reduce the runtin 
coverage. We also explore the use of dominator tree information to redu( 
instrumentation points needed. Our experiments show tha ... 

Keywords: code coverage, dominator tree, dynamic code deletion, dyna 
on-demand instrumentation, testing 



3 Dynamo: a transparent dynamic optimization system 
Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia

May 2000 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLA
on Programming language design and implementation PLDI
Issue 5 

Publisher: ACM Press 

We describe the design and implementation of Dynamo, a software dyna 
system that is capable of transparently improving the performance of a n 
stream as it executes on the processor. The input native instruction streai 
dynamically generated (by a JIT for example), or it can come from the e: 
statically compiled native binary. This paper evaluates the Dynamo syste
more challenging situation, in order to emphasize the ... 

4 Hot cold optimization of large Windows/NT applications 
Robert Cohn, P. Geoffrey Lowney 

December 1996 Proceedings of the 29th annual ACM/IEEE internation 

Microarchitecture 
Publisher: IEEE Computer Society 

A dynamic instruction trace often contains many unnecessary instructior
only by the unexecuted portion of the program. Hot-cold optimization (l
that realizes this performance opportunity. HCO uses profile informatior
routine into frequently executed (hot) and infrequently executed (cold) p
operations in the hot portion are removed, and compensation code is add
from hot to cold as needed. We evaluate HCO on a ...

Keywords: optimization, profile, NT, register allocation



5 Efficient and flexible value sampling 

M. Burrows, U. Erlingson, S.-T. A. Leung, M. T. Vandevoorde, C. A. Wal
Walker, W. E. Weihl 

November 2000 ACM SIGPLAN Notices, Volume 35 Issue 11
Publisher: ACM Press 

This paper presents novel sampling-based techniques for collecting statif 
register contents, data values, and other information associated with insti 
memory latencies. Values of interest are sampled in response to periodic 
resulting value profiles can be analyzed by programmers and optimizers 
performance of production uniprocessor and multiprocessor systems. Our
system extends the DCPI continuous profiling infrastructu ...

6 Efficient and flexible value sampling 

M. Burrows, U. Erlingson, S.-T. A. Leung, M. T. Vandevoorde, C. A. Wal
W. E. Weihl 

November 2000 ACM SIGOPS Operating Systems Review, ACM SIG
Architecture News, Proceedings of the ninth internatii
Architectural support for programming languages and
ASPLOS-IX, Volume 34, 28 Issue 5, 5

Publisher: ACM Press 

This paper presents novel sampling-based techniques for collecting statij 
register contents, data values, and other information associated with insti
memory latencies. Values of interest are sampled in response to periodic
resulting value profiles can be analyzed by programmers and optimizers
performance of production uniprocessor and multiprocessor systems. Our
system extends the DCPI continuous profiling infrastructu ...

7 Exploiting hardware performance counters with flow and context sensitive 
Glenn Ammons, Thomas Ball, James R. Larus

May 1997 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLA
on Programming language design and implementation PLDI
Issue 5 

Publisher: ACM Press 

A program profile attributes run-time costs to portions of a program's exi 
profiling systems suffer from two major deficiencies: first, they only app 
metrics, such as execution frequency or elapsed time to static, syntactic i
procedures or statements; second, they aggressively reduce the volume o 
collected and reported, although aggregation can hide striking difference 
behavior. This paper addresses both concerns by exploiting the har ... 

8 A framework for reducing the cost of instrumented code 
Matthew Arnold, Barbara G. Ryder

May 2001 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLA
on Programming language design and implementation PLDI
Issue 5 

Publisher: ACM Press 

Instrumenting code to collect profiling information can cause substantial 
overhead. This overhead makes instrumentation difficult to perform at n 
preventing many known offline feedback-directed optimizations from be
systems. This paper presents a general framework for performing instrur.
to reduce the overhead of previously expensive instrumentation. The frai 
and effective, using code-duplication and coun ... 

9 A dynamic optimization framework for a Java just-in-time compiler
Toshio Suganuma, Toshiaki Yasue, Motohiro Kawahito, Hideaki Komatsu
October 2001 ACM SIGPLAN Notices, Proceedings of the 16th ACM !

conference on Object oriented programming, systems, lai
applications OOPSLA '01, Volume 36 Issue 11
Publisher: ACM Press 

The high performance implementation of Java Virtual Machines (JVM) ; 
(JIT) compilers is directed toward adaptive compilation optimizations or 
runtime profile information. This paper describes the design and implem 
dynamic optimization framework in a production-level Java JIT compile 
to employ a mixed mode interpreter and a three level optimizing compih 
full, and special optimization, each of which has a differ ... 



10 Profile-based optimizations: Dynamic trace selection using performance m 
sampling 

Howard Chen, Wei-Chung Hsu, Jiwei Lu, Pen-Chung Yew, Dong-Yuan C
March 2003 Proceedings of the international symposium on Code genei 
optimization: feedback-directed and runtime optimization 
Publisher: IEEE Computer Society 

Optimizing programs at run-time provides opportunities to apply aggresi 
programs based on information that was not available at compile time. A
programs can be adapted to better exploit architectural features, optimize
libraries, and simplify code based on run-time constants. Our profiling sy
framework for collecting information required for performing run-time c 
sample the performance hardware registers available on ... 



11 Dynamic hot data stream prefetching for general-purpose programs 
Trishul M. Chilimbi, Martin Hirzel

May 2002 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLA
on Programming language design and implementation PLDI
Issue 5 
Publisher: ACM Press 

Prefetching data ahead of use has the potential to tolerate the growing pr 
performance gap by overlapping long latency memory accesses with use 
While sophisticated prefetching techniques have been automated for lim: 
as scientific codes that access dense arrays in loop nests, a similar level ( 
eluded general-purpose programs, especially pointer-chasing codes writt 
such as C and C++. We address this problem by describing ... 

Keywords: data reference profiling, dynamic optimization, dynamic pro 
performance optimization, prefetching, temporal profiling 



12 Compilation and run-time systems: Vacuum packing: extracting hardware- 
phases for post-link optimization 

Ronald D. Barnes, Erik M. Nystrom, Matthew C. Merten, Wen-mei W. Hv 

November 2002 Proceedings of the 35th annual ACM/IEEE internation

Microarchitecture 

Publisher: IEEE Computer Society Press 

This paper presents Vacuum Packing, a new approach to profile-based p: 
optimization. Instead of using traditional aggregate or summarized execi
weights, this approach uses a transparent hardware profiler to automatic? 
phases and record branch profile information for each new phase. The cc 
algorithm then produces code packages that are specially formed for thei 
phases. The algorithm compensates for the incomplete and often incoher 

13 Dynamic Adaptive compilation: Dynamic profiling and trace cache genera 
Marc Berndl, Laurie Hendren

March 2003 Proceedings of the international symposium on Code genei 
optimization: feedback-directed and runtime optimization 
Publisher: IEEE Computer Society 

Dynamic program optimization is increasingly important for achieving g
performance. A key issue is how to select which code to optimize. One a 
dynamically detect traces, long sequences of instructions spanning multi 
are likely to execute to completion. Traces are easy to optimize and have 
good unit for optimization. This paper reports on a new approach for dyn
creating and storing traces in a Java virtual machine. We ... 